KMID : 1132720230210030040
Genomics & Informatics 2023 Volume.21 No. 3 p.40 ~ p.40
A streamlined pipeline based on HmmUFOtu for microbial community profiling using 16S rRNA amplicon sequencing
Kim Hyeon-Woo, Kim Ji-Won, Choi Ji-Won, Ahn Kwang-Sung, Park Dong-Il, Kim Sang-Soo
Abstract
Microbial community profiling using 16S rRNA amplicon sequencing allows for taxonomic characterization of diverse microorganisms. While amplicon sequence variant (ASV) methods are increasingly favored for their fine-grained resolution of sequence variants, they often discard substantial portions of sequencing reads during quality control, particularly in datasets with large numbers of samples. We present a streamlined pipeline that integrates FastP for read trimming, HmmUFOtu for operational taxonomic unit (OTU) clustering, Vsearch for chimera checking, and Kraken2 for taxonomic assignment. To assess the pipeline's performance, we reprocessed two published stool datasets of normal Korean populations: one with 890 and the other with 1,462 independent samples. In the first dataset, HmmUFOtu retained 93.2% of over 104 million read pairs after quality trimming, discarding chimeric or unclassifiable reads, while DADA2, a commonly used ASV method, retained only 44.6% of the reads. Nonetheless, both methods yielded qualitatively similar β-diversity plots. For the second dataset, HmmUFOtu retained 89.2% of read pairs, while DADA2 retained a mere 18.4% of the reads. HmmUFOtu, being a closed-reference clustering method, facilitates merging separately processed datasets, with shared OTUs between the two datasets exhibiting a correlation coefficient of 0.92 in total abundance (log scale). While the first two dimensions of the β-diversity plot exhibited a cohesive mixture of the two datasets, the third dimension revealed the presence of a batch effect. Our comparative evaluation of ASV and OTU methods within this streamlined pipeline provides valuable insights into their performance when processing large-scale microbial 16S rRNA amplicon sequencing data. The strengths of HmmUFOtu and its potential for dataset merging are highlighted.
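The dataset-merging comparison described in the abstract (correlation of total abundance over shared OTUs, on a log scale) can be illustrated with a minimal sketch. This is not the authors' code. The function name, the toy OTU identifiers and counts, the choice of Pearson correlation, and the use of log10 are all assumptions made for illustration; the abstract does not state which correlation coefficient or log base was used.

```python
import math


def shared_otu_log_correlation(totals_a, totals_b):
    """Pearson r of log10 total abundance over OTUs found in both datasets.

    totals_a / totals_b map an OTU identifier to its total read count,
    summed over all samples of that dataset (hypothetical input layout).
    """
    # A closed-reference method gives both datasets the same OTU identifiers,
    # so shared OTUs are simply the keys present (with nonzero counts) in both.
    shared = sorted(
        otu
        for otu in totals_a.keys() & totals_b.keys()
        if totals_a[otu] > 0 and totals_b[otu] > 0
    )
    if len(shared) < 2:
        raise ValueError("need at least two shared OTUs")

    xs = [math.log10(totals_a[otu]) for otu in shared]
    ys = [math.log10(totals_b[otu]) for otu in shared]

    n = len(shared)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("abundances are constant; correlation is undefined")
    return cov / math.sqrt(var_x * var_y)


# Toy totals (invented values, not data from the study).
dataset_1 = {"OTU_1": 1000, "OTU_2": 100, "OTU_3": 10, "OTU_4": 5}
dataset_2 = {"OTU_1": 10000, "OTU_2": 1000, "OTU_3": 100, "OTU_5": 7}

r = shared_otu_log_correlation(dataset_1, dataset_2)
print(f"shared-OTU log-abundance correlation: {r:.2f}")
```

OTUs that appear in only one dataset (OTU_4 and OTU_5 in the toy input) are excluded before the correlation is computed, which mirrors the abstract's restriction to shared OTUs.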
Keywords
amplicon sequencing, DADA2, HmmUFOtu, metagenomics